Abstract

The Steiner Traveling Salesman Problem (STSP) is a variant of 
the Traveling Salesman Problem (TSP) that is particularly suitable 
when dealing with sparse networks, such as road networks. The stan- 
dard integer programming formulation of the STSP has an exponential 
number of constraints, just like the standard formulation of the TSP. 
On the other hand, there exist several known compact formulations of
the TSP, i.e., formulations with a polynomial number of both variables
and constraints. In this paper, we show that some of these compact
formulations can be adapted to the STSP. We also briefly discuss the 
adaptation of our formulations to some closely-related problems. 

Keywords: traveling salesman problem, integer programming, extended
formulations.

1 Introduction

The Traveling Salesman Problem (TSP), in its undirected version, can be
defined as follows. We are given a complete undirected graph $G = (V, E)$
and a positive integer cost $c_e$ for each edge $e \in E$. The task is to find a
Hamiltonian circuit, or tour, of minimum total cost. The best algorithms
for solving the TSP to proven optimality, such as the ones described in
[1, 28, 32], are based on a formulation of the TSP as a 0-1 linear program
due to Dantzig et al. [7], which we present in Subsection 2.1 of this paper.
The Dantzig et al. formulation has only one variable per edge, but has an
exponentially large number of constraints, which makes cutting-plane
methods necessary (see again [1, 28, 32]). If one wishes to avoid this
complication, one can instead use a so-called compact formulation of the
TSP, i.e., a formulation with a polynomial number of both variables and
constraints. A variety of compact formulations are available (see the
surveys [18, 29, 31, 33] and also Subsection 2.2 of this paper).

When dealing with routing problems on real-life road networks, however, 
one often encounters the following variant of the TSP. The graph G is not 
complete, not every node must be visited by the salesman, nodes may be 
visited more than once if desired, and edges may be traversed more than once 
if desired. This variant of the TSP was proposed, apparently independently, 
by three sets of authors [6, 13, 30]. (The special case in which all nodes
must be visited was considered earlier in [20, 26].) We will follow Cornuejols
et al. [6] in calling this variant the Steiner TSP, or STSP for short.

As noted in [6, 13], it is possible to convert any instance of the STSP
into an instance of the standard TSP, by computing shortest paths between 
every pair of required nodes. So, in principle, one could use any of the above- 
mentioned TSP formulations to solve the STSP. If, however, the original 
STSP instance is defined on a sparse graph, the conversion to a standard 
TSP instance increases the number of variables substantially, which may be 
undesirable. For this reason, we have decided in this paper to present and 
analyse some compact formulations for the STSP. 
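
The conversion mentioned above can be illustrated with a small sketch. The function names (`shortest_path_costs`, `stsp_to_tsp`) and the toy path graph are assumptions made for illustration; they do not come from the references. The sketch computes the shortest-path cost between every pair of required nodes, which gives the edge costs of the equivalent standard TSP instance.

```python
import heapq

def shortest_path_costs(adj, source):
    """Dijkstra's algorithm: cheapest cost from `source` to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in adj[u]:
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def stsp_to_tsp(edges, required):
    """Return the TSP edge costs on the complete graph over the required nodes."""
    adj = {}
    for i, j, cost in edges:  # undirected edges (i, j, cost)
        adj.setdefault(i, []).append((j, cost))
        adj.setdefault(j, []).append((i, cost))
    tsp_costs = {}
    for s in required:
        dist = shortest_path_costs(adj, s)
        for t in required:
            if s < t:
                tsp_costs[(s, t)] = dist[t]
    return tsp_costs

# Hypothetical sparse instance: a path 1-2-3-4-5 with unit costs.
path_edges = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)]
print(stsp_to_tsp(path_edges, [1, 3, 5]))
print(len(stsp_to_tsp(path_edges, [1, 2, 3, 4, 5])))
```

On this toy instance the sparse graph has 4 edges, whereas the converted instance with all five nodes required has 10, which is the growth in the number of variables referred to above.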

The paper is structured as follows. We review the relevant literature
on TSP and STSP formulations in Section 2. In Section 3, we show how
to adapt so-called commodity-flow formulations of the TSP to the Steiner
case, and make some remarks about the relative strength of the resulting
formulations. In Section 4, we adapt the so-called time-staged formulation
of the TSP to the Steiner case, and present a key theorem, which enables
one to reduce the number of variables substantially. Then, in Section 5, we
briefly discuss the possibility of adapting our compact formulations to some
other vehicle routing problems, when sparse graphs are involved rather than
complete graphs. Finally, some concluding remarks appear in Section 6.

2 Literature Review 

We now review the relevant literature. We cover the classical formulation
of the standard TSP in Subsection 2.1, compact formulations of the standard
TSP in Subsection 2.2, and the classical formulation of the STSP in
Subsection 2.3.



2.1 The classical formulation of the standard TSP 

The classical and most commonly-used formulation of the standard TSP is 
the following one, due to Dantzig, Fulkerson and Johnson [7]:

\begin{align}
\min\quad & \sum_{e \in E} c_e x_e \notag\\
\text{s.t.}\quad & \sum_{e \in \delta(\{i\})} x_e = 2 & (\forall i \in V) \tag{1}\\
& \sum_{e \in \delta(S)} x_e \ge 2 & (\forall S \subset V : 2 \le |S| \le |V|/2) \tag{2}\\
& x_e \in \{0, 1\} & (\forall e \in E). \notag
\end{align}

Here, $x_e$ is a binary variable, taking the value 1 if and only if the edge $e$
belongs to the tour, and, for any $S \subset V$, $\delta(S)$ denotes the set of edges having
exactly one end-node inside $S$. The constraints (1), called degree constraints,
enforce that the tour uses exactly two of the edges incident on each node.
The constraints (2), called subtour elimination constraints, ensure that the
tour is connected.

We will call this formulation the DFJ formulation. A key feature of this
formulation is that the subtour elimination constraints (2) are exponential
in number.

2.2 Compact formulations of the standard TSP 

As mentioned above, a wide variety of compact formulations exist for the
standard TSP, and there are several surveys available (e.g., [18, 29, 31, 33]).
For the sake of brevity, we mention here only four of them. All of them
start by setting $V = \{1, 2, \ldots, n\}$ and viewing node 1 as a 'depot', which the
salesman must leave at the start of the tour and return to at the end of the 
tour. Moreover, all of them can be used for the asymmetric TSP as well as 
for the standard (symmetric) TSP. 

We begin with the formulation of Miller, Tucker & Zemlin [27], which
we call the MTZ formulation. For all node pairs $(i, j)$, let $x_{ij}$ be a binary
variable, taking the value 1 if and only if the salesman travels from node $i$
to node $j$. Also, for $i = 2, \ldots, n$, let $u_i$ be a continuous variable representing
the position of node $i$ in the tour. (The depot can be thought of as being at
positions 0 and $n$.) The MTZ formulation is then:



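\begin{align}
\min\quad & \sum_{i,j=1}^{n} c_{ij} x_{ij} \tag{3}\\
\text{s.t.}\quad & \sum_{j=1}^{n} x_{ji} = 1 & (1 \le i \le n) \tag{4}\\
& \sum_{j=1}^{n} x_{ij} = 1 & (1 \le i \le n) \tag{5}\\
& x_{ij} \in \{0, 1\} & (1 \le i, j \le n;\ i \ne j) \tag{6}\\
& u_i - u_j + (n - 1) x_{ij} \le n - 2 & (2 \le i, j \le n;\ i \ne j) \tag{7}\\
& 1 \le u_i \le n - 1 & (2 \le i \le n). \tag{8}
\end{align}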

The constraints (4) and (5) ensure that the salesman arrives at and departs
from each node exactly once. The constraints (7) ensure that, if the salesman
travels from $i$ to $j$, then the position of node $j$ is at least one more than
that of node $i$. Together with the bounds (8), this ensures that each
non-depot node is in a unique position.
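
To make the role of the position variables concrete, the following sketch builds the $x$ and $u$ values induced by a tour and checks the MTZ constraints (7) and (8). The helper names `mtz_from_tour` and `satisfies_mtz` are hypothetical, and the check is only an illustration of the constraints as stated above, not a solver.

```python
def mtz_from_tour(tour):
    """tour: list of nodes starting at the depot (node 1), each node once."""
    n = len(tour)
    # x[(i, j)] = 1 when the salesman travels directly from i to j.
    x = {(tour[p], tour[(p + 1) % n]): 1 for p in range(n)}
    # u[i] = position of non-depot node i (the depot sits at position 0).
    u = {node: pos for pos, node in enumerate(tour) if node != 1}
    return x, u

def satisfies_mtz(n, x, u):
    """Check the bounds (8) and the ordering constraints (7)."""
    for i in range(2, n + 1):
        if not 1 <= u[i] <= n - 1:
            return False
        for j in range(2, n + 1):
            if i != j and u[i] - u[j] + (n - 1) * x.get((i, j), 0) > n - 2:
                return False
    return True

x_tour, u_tour = mtz_from_tour([1, 3, 2, 4])
print(satisfies_mtz(4, x_tour, u_tour))

# Two disjoint subtours 1-2-1 and 3-4-3: no choice of u can satisfy (7).
x_sub = {(1, 2): 1, (2, 1): 1, (3, 4): 1, (4, 3): 1}
print(satisfies_mtz(4, x_sub, {2: 1, 3: 2, 4: 3}))
```

For the subtour 3-4-3, adding the two instances of (7) for the arcs (3,4) and (4,3) gives $2(n-1) \le 2(n-2)$, which is impossible; this is why any subtour that avoids the depot is excluded.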

The MTZ formulation is compact, having only $O(n^2)$ variables and
$O(n^2)$ constraints. Unfortunately, Padberg & Sung [33] show that its LP
relaxation yields an extremely weak lower bound, much weaker than that of
the DFJ formulation.

The next compact formulation, historically, was the 'time-staged' (TS)
formulation proposed by both Vajda [33] and Houck et al. [21] independently.
For all $1 \le i, j, k \le n$ with $i \ne j$, let $r^k_{ij}$ be a binary variable taking
the value 1 if and only if the edge $\{i, j\}$ is the $k$th edge to be traversed in
the tour, and is traversed in the direction going from $i$ to $j$. We then have:

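\begin{align}
\min\quad & \sum_{i=2}^{n} c_{1i} r^1_{1i} + \sum_{k=2}^{n-1} \sum_{i,j=2}^{n} c_{ij} r^k_{ij} + \sum_{i=2}^{n} c_{i1} r^n_{i1} \notag\\
\text{s.t.}\quad & \sum_{i=2}^{n} r^1_{1i} = 1 \tag{9}\\
& \sum_{i=2}^{n} r^n_{i1} = 1 \tag{10}\\
& \sum_{k=1}^{n-1} \sum_{j=1}^{n} r^k_{ji} = 1 & (2 \le i \le n) \tag{11}\\
& \sum_{j=1}^{n} r^k_{ji} = \sum_{j=1}^{n} r^{k+1}_{ij} & (2 \le i \le n;\ 1 \le k \le n - 1) \tag{12}\\
& r^k_{ij} \in \{0, 1\} & (1 \le i, j, k \le n;\ i \ne j). \notag
\end{align}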

The constraints (9) and (10) state that the salesman must leave the depot
at the start of the tour and return to it at the end. The constraints (11)
ensure that the salesman arrives at each non-depot node exactly once, and
the constraints (12) ensure that the salesman departs from each node that
he visits.

The TS formulation has $O(n^3)$ variables and $O(n^2)$ constraints. It follows
from results in [18, 33] that the associated lower bound is intermediate
in strength between the MTZ and DFJ bounds.

Next, we mention the single-commodity flow (SCF) formulation of Gavish
& Graves [15]. Imagine that the salesman carries $n - 1$ units of a commodity
when he leaves node 1, and delivers 1 unit of this commodity to each
other node. Let the $x_{ij}$ variables be defined as above, and define additional
continuous variables $g_{ij}$, representing the amount of the commodity (if any)
passing directly from node $i$ to node $j$. The formulation then consists of the
objective function (3), the constraints (4)-(6), and the following constraints:

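\begin{align}
& \sum_{j=1}^{n} g_{ji} - \sum_{j=2}^{n} g_{ij} = 1 & (2 \le i \le n) \tag{13}\\
& 0 \le g_{ij} \le (n - 1) x_{ij} & (1 \le i, j \le n;\ j \ne i). \tag{14}
\end{align}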

The constraints (13) ensure that one unit of the commodity is delivered to
each non-depot node. The bounds (14) ensure that the commodity can flow
only along edges that are in the tour.
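
The commodity interpretation can be checked on a small example. The sketch below (the function names `scf_from_tour` and `satisfies_scf` are assumptions for illustration) derives the flow values $g_{ij}$ induced by a tour, with one unit dropped at each non-depot node, and verifies the flow constraints (13) and the bounds (14).

```python
def scf_from_tour(tour):
    """Arc variables x and flows g induced by a tour that starts at the depot (node 1)."""
    n = len(tour)
    x, g = {}, {}
    load = n - 1  # units on board when leaving the depot
    for p in range(n):
        arc = (tour[p], tour[(p + 1) % n])
        x[arc] = 1
        g[arc] = load
        load -= 1  # one unit is delivered on arrival at the head of the arc
    return x, g

def satisfies_scf(n, x, g):
    """Check constraints (13) and (14); arcs absent from x and g carry the value 0."""
    for i in range(2, n + 1):
        inflow = sum(g.get((j, i), 0) for j in range(1, n + 1) if j != i)
        outflow = sum(g.get((i, j), 0) for j in range(2, n + 1) if j != i)
        if inflow - outflow != 1:
            return False
    return all(0 <= g[a] <= (n - 1) * x.get(a, 0) for a in g)

x_scf, g_scf = scf_from_tour([1, 3, 2, 4])
print(g_scf)
print(satisfies_scf(4, x_scf, g_scf))
```

On the tour 1-3-2-4-1 the salesman leaves the depot with three units and returns with none, so the flow on the successive arcs decreases by one at each stop.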

The SCF formulation has $O(n^2)$ variables and $O(n^2)$ constraints. It is
proved in [33] that the associated lower bound is intermediate in strength
between the MTZ and DFJ bounds. Later on, in [18], it was shown that it
is in fact intermediate in strength between the MTZ and TS bounds.

Finally, we mention the multi-commodity flow (MCF) formulation of
Claus [5]. Here, we imagine that the salesman carries $n - 1$ commodities,
one unit of each for each customer. Let the $x_{ij}$ variables be defined as
above. Also define, for all $1 \le i, j \le n$ with $i \ne j$ and all $2 \le k \le n$,
the additional continuous variable $f^k_{ij}$, representing the amount of the $k$th
commodity (if any) passing directly from node $i$ to node $j$. The formulation
then consists of the objective function (3), the constraints (4)-(6), and the
following constraints:

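\begin{align}
& 0 \le f^k_{ij} \le x_{ij} & (k = 2, \ldots, n;\ \{i, j\} \subset \{1, \ldots, n\}) \tag{15}\\
& \sum_{i=2}^{n} f^k_{1i} = 1 & (k = 2, \ldots, n) \tag{16}\\
& \sum_{i=1}^{n} f^k_{ik} = 1 & (k = 2, \ldots, n) \tag{17}\\
& \sum_{i=1}^{n} f^k_{ij} - \sum_{i=2}^{n} f^k_{ji} = 0 & (k = 2, \ldots, n;\ j \in \{2, \ldots, n\} \setminus \{k\}). \tag{18}
\end{align}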

The constraints (15) state that a commodity cannot flow along an edge
unless that edge belongs to the tour. The constraints (16) and (17) impose
that each commodity leaves the depot and arrives at its destination. The
constraints (18) ensure that, when a commodity arrives at a node that is
not its final destination, then it also leaves that node.

The MCF formulation has $O(n^3)$ variables and $O(n^3)$ constraints. It is
proved in [33] that the associated lower bound is equal to the DFJ bound.
Therefore, this is the strongest of the four compact formulations mentioned.

2.3 The classical formulation of the STSP 

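In the STSP, $G = (V, E)$ is permitted to be a general graph, and a set
$V_R \subseteq V$ of required nodes is specified. The formulation given in [13] is as
follows:

\begin{align}
\min\quad & \sum_{e \in E} c_e x_e \tag{19}\\
\text{s.t.}\quad & \sum_{e \in \delta(S)} x_e \ge 2 & (S \subset V : S \cap V_R \ne \emptyset,\ V_R \setminus S \ne \emptyset) \tag{20}\\
& \sum_{e \in \delta(i)} x_e \ \text{even} & (i \in V) \tag{21}\\
& x_e \in \mathbb{Z}_+ & (e \in E). \tag{22}
\end{align}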

Note that the $x$ variables are now general-integer variables. Note also that
the parity conditions (21) are non-linear. (They can be easily linearised,
using one additional variable for each node.) The crucial point, however, is
that there are an exponential number of the connectivity constraints (20).

3 Flow-Based Formulations of the STSP 

In this section, we adapt the formulations SCF and MCF, mentioned in
Subsection 2.2, to the Steiner case. We also give some results concerned
with the strength of the LP relaxations of our formulations.

3.1 Some notation and a useful lemma 

At this point, we present some additional notation. Let $G = (V, A)$ be a
directed graph, where the set of directed arcs $A$ is obtained from the edge
set $E$ by replacing each edge $\{i, j\}$ with two directed arcs $(i, j)$ and $(j, i)$.
For each arc $a \in A$, the cost $c_a$ is viewed as being equal to the cost of the
corresponding edge. For any node set $S \subset V$, let $\delta^+(S)$ denote the set of
arcs in $A$ whose tail is in $S$ and whose head is in $V \setminus S$, and let $\delta^-(S)$ denote
the set of arcs in $A$ for which the reverse holds. For readability, we write
$\delta^+(i)$ and $\delta^-(i)$ in place of $\delta^+(\{i\})$ and $\delta^-(\{i\})$, respectively. Finally, let
$n_R = |V_R|$ denote the number of required nodes.
We will find the following lemma useful:

Lemma 1 In an optimal solution to the STSP, no edge will be traversed
more than once in either direction.

This lemma is part of the folklore, but an explicit proof can be found in the 
appendix of [23]. 

Using this fact, one can define a binary variable $x_a$ for each arc $a \in A$,
taking the value 1 if and only if the salesman travels along $a$.

3.2 An initial single-commodity flow formulation 

Without loss of generality, assume that node 1 is required. By analogy with
the case of the standard TSP, we imagine that the salesman departs the
depot with $n_R - 1$ units of the commodity, and delivers one unit of that
commodity to each required node. So, for each arc $a \in A$, let the new
variable $g_a$ represent the amount of the commodity passing through $a$. The
single-commodity flow formulation (SCF) may then be adapted to the sparse
graph setting as follows:

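\begin{align}
\min\quad & \sum_{a \in A} c_a x_a \tag{23}\\
\text{s.t.}\quad & \sum_{a \in \delta^+(i)} x_a \ge 1 & (\forall i \in V_R) \tag{24}\\
& \sum_{a \in \delta^+(i)} x_a = \sum_{a \in \delta^-(i)} x_a & (\forall i \in V) \tag{25}\\
& \sum_{a \in \delta^-(i)} g_a - \sum_{a \in \delta^+(i)} g_a = 1 & (\forall i \in V_R \setminus \{1\}) \tag{26}\\
& \sum_{a \in \delta^-(i)} g_a - \sum_{a \in \delta^+(i)} g_a = 0 & (\forall i \in V \setminus V_R) \tag{27}\\
& 0 \le g_a \le (n_R - 1) x_a & (\forall a \in A) \tag{28}\\
& x_a \in \{0, 1\} & (\forall a \in A). \tag{29}
\end{align}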

The constraints (24) ensure that the salesman departs from each required
node at least once, and the constraints (25) ensure that the salesman departs
from each node as many times as he arrives. The constraints (26) impose
that one unit of the commodity is delivered to each required node, and
the constraints (27) ensure that the amount of commodity on board when
leaving a non-required node is equal to the amount when arriving. The
bounds (28) ensure that, if any of the commodity passes along an arc, then
that arc appears in the tour.
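
As an illustration of how a Steiner tour maps onto these variables, the sketch below converts a closed walk on a sparse graph into arc variables $x_a$ and flows $g_a$, delivering to each required node on its first visit, and then checks the constraints (24)-(28). The function names and the star-shaped example are assumptions chosen for illustration.

```python
def scf_solution_from_walk(walk, required):
    """walk: closed walk as a node list with walk[0] == walk[-1] == 1 (the depot)."""
    load = len(required) - 1  # n_R - 1 units on board at departure
    served = {1}
    x, g = {}, {}
    for i, j in zip(walk, walk[1:]):
        if (i, j) in x:
            raise ValueError("arc traversed twice; see Lemma 1")
        x[(i, j)] = 1
        g[(i, j)] = load
        if j in required and j not in served:
            served.add(j)  # deliver on the first visit only
            load -= 1
    return x, g

def check_scf_stsp(nodes, required, x, g):
    """Check (24)-(28); arcs that are absent from x and g carry the value 0."""
    n_r = len(required)
    out_sum = lambda vals, i: sum(v for (t, h), v in vals.items() if t == i)
    in_sum = lambda vals, i: sum(v for (t, h), v in vals.items() if h == i)
    for i in nodes:
        if i in required and out_sum(x, i) < 1:
            return False  # (24)
        if out_sum(x, i) != in_sum(x, i):
            return False  # (25)
        if i != 1:
            rhs = 1 if i in required else 0
            if in_sum(g, i) - out_sum(g, i) != rhs:
                return False  # (26) and (27)
    return all(0 <= g[a] <= (n_r - 1) * x[a] for a in x)  # (28)

# Star graph: non-required centre 2, required leaves 1 (depot), 3 and 4.
star_walk = [1, 2, 3, 2, 4, 2, 1]
star_required = {1, 3, 4}
x_star, g_star = scf_solution_from_walk(star_walk, star_required)
print(g_star)
print(check_scf_stsp([1, 2, 3, 4], star_required, x_star, g_star))
```

Note that node 2 is visited three times in this walk, which would be impossible in a standard TSP tour, yet the flow conservation at node 2 (constraint (27)) still balances because the total inflow and total outflow are both 3.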

This formulation contains $O(|E|)$ variables and $O(|E|)$ constraints.

Using a technique due to Gouveia [17], we can project this formulation
into the space of the $x$ variables:

Theorem 1 Let $(x^*, g^*) \in [0, 1]^{|A|} \times \mathbb{R}_+^{|A|}$ be a feasible solution to the LP
relaxation of the formulation (23)-(29). Let $\bar{x}^*$ be the corresponding point
in $[0, 2]^{|E|}$ defined by setting $\bar{x}^*_{ij} = x^*_{ij} + x^*_{ji}$ for all $\{i, j\} \in E$. Then $\bar{x}^*$
satisfies all of the following linear inequalities:

\[
\sum_{e \in \delta(S)} \bar{x}_e \ge \frac{2 |S \cap V_R|}{n_R - 1} \qquad (\forall S \subset V \setminus \{1\} : S \cap V_R \ne \emptyset). \tag{30}
\]

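Proof. If we sum the constraints (26) over all $i \in S \cap V_R$, together with
the constraints (27) over all $i \in S \setminus V_R$, we obtain:

\[
\sum_{a \in \delta^-(S)} g_a = \sum_{a \in \delta^+(S)} g_a + |S \cap V_R|.
\]

Together with the bounds (28), this implies:

\[
(n_R - 1) \sum_{a \in \delta^-(S)} x_a \ge |S \cap V_R|. \tag{31}
\]

Now, the equations (25) imply:

\[
\sum_{a \in \delta^-(S)} x_a = \sum_{a \in \delta^+(S)} x_a. \tag{32}
\]

From (31) and (32) we obtain:

\[
\sum_{a \in \delta^-(S) \cup \delta^+(S)} x_a \ge \frac{2 |S \cap V_R|}{n_R - 1}.
\]

The result then follows from the construction of $\bar{x}^*$. □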

Note that the inequalities (30) are weaker than the connectivity inequalities
(20). As a result, the lower bound associated with the SCF formulation
(23)-(29) cannot be better than the one associated with Fleischmann's
formulation (19)-(22).



3.3 Strengthened single-commodity flow formulation 

It is possible to strengthen the SCF formulation given in the previous
subsection. Note that one can assume that, if any required node is visited more
than once by the salesman, then the commodity is delivered on the first
visit. Accordingly, for each node $i \in V \setminus \{1\}$, let $r_i$ be the minimum number
of required nodes (not including the depot) that the salesman must have
visited when he leaves $i$ for the first time. Also, by convention, let $r_1 = 0$.
(Note that one can compute $r_i$ for all $i \in V \setminus \{1\}$ efficiently, using Dijkstra's
single-source shortest-path algorithm [8].) Now, the constraints (28) can be
replaced with the following stronger constraints:

\[
0 \le g_{ij} \le (n_R - r_i - 1) x_{ij} \qquad (\forall (i, j) \in A). \tag{33}
\]

This makes the projection into x-space stronger, as expressed in the following 
theorem: 

Theorem 2 Let $(x^*, g^*)$ be a feasible solution to the LP relaxation of the
formulation (23)-(27), (29), (33). Also, for any set $S \subseteq V \setminus \{1\}$ such
that $S \cap V_R \ne \emptyset$, let $T(S)$ be the set of all nodes that are not in $S$ but are
adjacent to at least one node in $S$. Finally, define $L(S) = \min_{j \in T(S)} r_j$ and
$U(S) = \max_{j \in T(S)} r_j$. Then, the point $\bar{x}^*$ corresponding to $(x^*, g^*)$ satisfies the
following inequality for all such sets $S$ and for $k = L(S), \ldots, U(S)$:

\[
(n_R - k - 1) \sum_{e \in \delta(S)} \bar{x}_e + 2 \sum_{\{i,j\} \in \delta(S) :\, j \in S} \max\{0, k - r_i\}\, \bar{x}_{ij} \ge 2 |S \cap V_R|. \tag{34}
\]

Proof. As in the proof of Theorem 1, the constraints (26) and (27) imply:

\[
\sum_{a \in \delta^-(S)} g_a = \sum_{a \in \delta^+(S)} g_a + |S \cap V_R|.
\]

Using the strengthened bounds (33), this implies:

\[
\sum_{(i,j) \in \delta^-(S)} (n_R - r_i - 1) x_{ij} \ge |S \cap V_R|.
\]

We can re-write this as:

\[
(n_R - k - 1) \sum_{(i,j) \in \delta^-(S)} x_{ij} + \sum_{(i,j) \in \delta^-(S)} (k - r_i) x_{ij} \ge |S \cap V_R|.
\]

Together with non-negativity on $x$ this implies:

\[
(n_R - k - 1) \sum_{(i,j) \in \delta^-(S)} x_{ij} + \sum_{(i,j) \in \delta^-(S)} \max\{0, k - r_i\} (x_{ij} + x_{ji}) \ge |S \cap V_R|.
\]

The result then follows from the identity (32) and the construction of $\bar{x}^*$. □



Our experiments on small instances lead us to conjecture that the
inequalities (34), together with the bounds $\bar{x} \in [0, 2]^{|E|}$, give a complete
description of the projection into $x$-space.

Note that, if one sets $k = L(S)$ in Theorem 2, then $\max\{0, k - r_i\} = 0$ for
every $i \in T(S)$, and one obtains the following family of inequalities:

\[
\sum_{e \in \delta(S)} \bar{x}_e \ge \frac{2 |S \cap V_R|}{n_R - L(S) - 1} \qquad (\forall S \subset V \setminus \{1\} : S \cap V_R \ne \emptyset).
\]

Since $L(S)$ cannot exceed $n_R - 1 - |S \cap V_R|$, these inequalities are intermediate
in strength between the inequalities (30) and the connectivity inequalities
(20). Accordingly, we conjecture that the lower bound from the strengthened
SCF formulation always lies between the one from the original SCF
formulation and the one from Fleischmann's formulation.

We also remark that one could tighten the constraints (33) further for
the arcs that are incident on the depot. Indeed, in an optimal solution, the
salesman would never depart from the depot without at least one unit of
the commodity, and would never arrive at the depot with more than $n_R - 2$
units of the commodity. One can check, however, that this further tightening
in the $(x, g)$-space does not lead to any improvement in the resulting valid
inequalities in the $x$-space.
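
The values $r_i$ used in this subsection can be obtained with a shortest-path computation in which entering a required non-depot node costs 1 and entering any other node costs 0. The sketch below is one possible reading of that remark, under the assumption that node $i$ itself is counted when it is required; the function name `min_required_visited` and the example graph are hypothetical.

```python
import heapq

def min_required_visited(adj, required, depot=1):
    """r[i] = fewest required non-depot nodes on any path from the depot to i
    (counting i itself when i is required), computed with Dijkstra's algorithm."""
    r = {depot: 0}
    heap = [(0, depot)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > r[u]:
            continue  # stale heap entry
        for v in adj[u]:
            # Entering a required non-depot node forces one more delivery.
            nd = d + (1 if v in required and v != depot else 0)
            if nd < r.get(v, float("inf")):
                r[v] = nd
                heapq.heappush(heap, (nd, v))
    return r

# Hypothetical graph with edges {1,2}, {2,3}, {3,4}, {2,4}; required nodes 1, 3, 4.
adj_example = {1: [2], 2: [1, 3, 4], 3: [2, 4], 4: [3, 2]}
r_example = min_required_visited(adj_example, {1, 3, 4})
print(r_example)

# Coefficient n_R - r_i - 1 of x_ij in the strengthened bound, per tail node i.
n_r = 3
print({i: n_r - r_i - 1 for i, r_i in r_example.items()})
```

In this example the bound on arcs leaving nodes 3 and 4 drops from $n_R - 1 = 2$ to 1, while arcs leaving the depot and the non-required node 2 keep the original bound.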

3.4 Multi-commodity flow formulation 

Similar to the MCF formulation for the standard TSP, we assume that the
salesman leaves the depot (node 1) with one unit of commodity for each
required node. Accordingly, let the binary variable $f^k_a$ be 1 if and only if
commodity $k$ passes through arc $a$, for every $k \in V_R \setminus \{1\}$ and $a \in A$.
The resulting formulation then consists of minimising (23) subject to the
following constraints:

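\begin{align}
& \sum_{a \in \delta^+(i)} x_a \ge 1 & (\forall i \in V_R) \tag{35}\\
& \sum_{a \in \delta^+(i)} x_a = \sum_{a \in \delta^-(i)} x_a & (\forall i \in V) \tag{36}\\
& \sum_{a \in \delta^-(i)} f^k_a - \sum_{a \in \delta^+(i)} f^k_a = 0 & (\forall i \in V \setminus \{1\};\ k \in V_R \setminus \{1, i\}) \tag{37}\\
& \sum_{a \in \delta^-(k)} f^k_a - \sum_{a \in \delta^+(k)} f^k_a = 1 & (\forall k \in V_R \setminus \{1\}) \tag{38}\\
& \sum_{a \in \delta^-(1)} f^k_a - \sum_{a \in \delta^+(1)} f^k_a = -1 & (\forall k \in V_R \setminus \{1\}) \tag{39}\\
& x_a \ge f^k_a & (\forall a \in A;\ k \in V_R \setminus \{1\}) \tag{40}\\
& x_a \in \{0, 1\} & (\forall a \in A) \tag{41}\\
& f^k_a \in \{0, 1\} & (\forall a \in A;\ k \in V_R \setminus \{1\}). \tag{42}
\end{align}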

The constraints are interpreted along similar lines to those of the formula- 
tions already seen. 



This MCF formulation has $O(n_R |E|)$ variables and $O(n_R |E|)$ constraints.
As for the projection into the space of $x$ variables, we have the following
result:

Proposition 1 Let $(x^*, f^*)$ be a feasible solution to the LP relaxation of
the MCF formulation. Let $\bar{x}^*$ be the corresponding point in $[0, 2]^{|E|}$ defined
by setting $\bar{x}^*_{ij} = x^*_{ij} + x^*_{ji}$ for all $\{i, j\} \in E$. Then $\bar{x}^*$ satisfies all of the
connectivity inequalities (20).



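Proof. For a fixed node $k \in V_R \setminus \{1\}$, the constraints (37)-(40), together
with the well-known max-flow min-cut theorem [14], imply the following
exponentially-large family of inequalities:

\[
\sum_{a \in \delta^-(S)} x^*_a \ge 1 \qquad (\forall S \subset V \setminus \{1\} : k \in S).
\]

The equations (36) then imply:

\[
\sum_{a \in \delta^+(S) \cup \delta^-(S)} x^*_a \ge 2 \qquad (\forall S \subset V \setminus \{1\} : k \in S).
\]

Next, the relationship between $x^*$ and $\bar{x}^*$ gives

\[
\sum_{e \in \delta(S)} \bar{x}^*_e \ge 2 \qquad (\forall S \subset V \setminus \{1\} : k \in S).
\]

Applying this for all $k \in V_R \setminus \{1\}$ yields the result. □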

This result implies that the lower bound from this MCF formulation is 
no worse than the one from Fleischmann's formulation. We conjecture that 
the two bounds are equal. 

4 Time-Staged Formulations of the STSP 

In this section, we adapt the TS formulation for the standard TSP, mentioned
in Subsection 2.2, to the Steiner case. A simple formulation is presented
in the following subsection. A method to reduce the number of variables
is presented in Subsection 4.2. Then, in Subsection 4.3, we evaluate
the total number of variables and constraints in each of the formulations
that we have considered.

4.1 An initial time-staged formulation 

In this context, it is natural to have one time stage for each time that an
edge of $G$ is traversed (in either direction). In terms of the classical STSP
formulation given in Subsection 2.3, the total number of time stages will
then be equal to $\sum_{e \in E} x_e$. The problem here is that we do not know this
value in advance. Observe, however, that Lemma 1 implies that it cannot
exceed $2|E|$.

Now, let $A$ be defined as in Subsection 3.1, and recall that $|A| = 2|E|$.
For all $a \in A$ and all $1 \le k \le |A|$, let the binary variable $r^k_a$ take the value
1 if and only if arc $a$ is the $k$th arc to be traversed in the tour. Our TS
formulation for the STSP is as follows:

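\begin{align}
\min\quad & \sum_{k=1}^{|A|} \sum_{a \in A} c_a r^k_a \tag{43}\\
\text{s.t.}\quad & \sum_{a \in \delta^+(1)} r^1_a = 1 \tag{44}\\
& r^1_a = 0 & (a \in A \setminus \delta^+(1)) \tag{45}\\
& \sum_{k=1}^{|A|} \sum_{a \in \delta^+(1)} r^k_a = \sum_{k=1}^{|A|} \sum_{a \in \delta^-(1)} r^k_a \tag{46}\\
& \sum_{k=1}^{|A|} \sum_{a \in \delta^+(i)} r^k_a \ge 1 & (\forall i \in V_R) \tag{47}\\
& \sum_{a \in \delta^-(i)} r^k_a = \sum_{a \in \delta^+(i)} r^{k+1}_a & (\forall i \in V \setminus \{1\};\ k = 1, \ldots, |A| - 1) \tag{48}\\
& r^k_a \in \{0, 1\} & (\forall a \in A;\ k = 1, \ldots, |A|). \tag{49}
\end{align}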

Constraints (44) and (45) ensure that the salesman departs from the depot
in the first time stage, and constraint (46) ensures that he arrives at the
depot as many times as he leaves it. Constraints (47) ensure that each
required node is visited at least once. Constraints (48) ensure that, if the
salesman arrives at a non-depot node in any given time stage, then he must
depart from it in the subsequent time stage. Finally, constraints (49) are
the usual binary conditions.
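
The time-staged variables can be illustrated by turning a closed walk into stage-indexed arc choices and checking constraints (44)-(48) directly. The names `ts_solution_from_walk` and `check_ts` and the star-shaped example are assumptions for illustration; the stage-linking check (48) is applied at non-depot nodes only, as in the interpretation given above.

```python
def ts_solution_from_walk(walk):
    """Set of pairs (k, arc) meaning r^k_arc = 1; walk starts and ends at the depot 1."""
    return {(k, arc) for k, arc in enumerate(zip(walk, walk[1:]), start=1)}

def check_ts(arcs, nodes, required, r):
    """Check constraints (44)-(48) for a 0-1 assignment given as a set of (k, arc)."""
    stages = range(1, len(arcs) + 1)
    val = lambda k, a: 1 if (k, a) in r else 0
    out_arcs = lambda i: [a for a in arcs if a[0] == i]
    in_arcs = lambda i: [a for a in arcs if a[1] == i]
    if sum(val(1, a) for a in out_arcs(1)) != 1:
        return False  # (44)
    if any(val(1, a) for a in arcs if a[0] != 1):
        return False  # (45)
    departures = sum(val(k, a) for k in stages for a in out_arcs(1))
    arrivals = sum(val(k, a) for k in stages for a in in_arcs(1))
    if departures != arrivals:
        return False  # (46)
    for i in required:
        if sum(val(k, a) for k in stages for a in out_arcs(i)) < 1:
            return False  # (47)
    for i in nodes:
        if i == 1:
            continue
        for k in range(1, len(arcs)):
            arrive = sum(val(k, a) for a in in_arcs(i))
            depart = sum(val(k + 1, a) for a in out_arcs(i))
            if arrive != depart:
                return False  # (48)
    return True

# Star graph with centre 2 and leaves 1 (depot), 3, 4; every edge gives two arcs.
ts_arcs = [(1, 2), (2, 1), (2, 3), (3, 2), (2, 4), (4, 2)]
ts_r = ts_solution_from_walk([1, 2, 3, 2, 4, 2, 1])
print(check_ts(ts_arcs, [1, 2, 3, 4], {1, 3, 4}, ts_r))
print(check_ts(ts_arcs, [1, 2, 3, 4], {1, 3, 4}, ts_r - {(3, (3, 2))}))
```

Removing the stage-3 arc breaks the chain at node 3: the salesman arrives there in stage 2 but does not depart in stage 3, so constraint (48) is violated.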

This TS formulation has $O(|E|^2)$ variables and $O(n|E|)$ constraints. We
conjecture that the lower bound from this TS formulation always lies
between the one from our strengthened SCF formulation and the one from
Fleischmann's formulation.

4.2 Bounding the number of edge traversals 

Clearly, one could reduce the number of variables and constraints in the 
above TS formulation if one had a better upper bound on the total number 
of times that the salesman traverses an edge of G. The following theorem 
provides such a bound: 

Theorem 3 For every instance of the STSP which has a solution, there
exists an optimal solution in which the total number of edge traversals (in
either direction) does not exceed $2(|V| - 1)$.

For the proof of this theorem, we will use the following lemma. 

Lemma 2 If $H$ is a connected graph on $k$ nodes which has more than
$2(k - 1)$ edges, then there exists a cycle $C$ in $H$ such that the graph arising when
the edges of $C$ are deleted from $H$ is still connected.

Proof. Let $T$ be a spanning tree in $H$, and let $H'$ be the graph resulting
if the edges of $T$ are deleted from $H$. Let $\ell$ denote the number of edges of $H$.
For the number $\ell'$ of edges of $H'$ we have $\ell' = \ell - (k - 1)$, which, by the
hypothesis in the lemma, is greater than $k - 1$. Clearly, the number of nodes
of $H'$ is equal to $k$.

Now, let $T'$ be a spanning forest in $H'$. Note that $H'$ may fail to be
connected. Firstly, if one of the connected components of $H'$ contains an
edge $e$ other than those in $T'$, then let $C$ be the cycle defined by taking $e$
and the path in $T'$ connecting the end-nodes of $e$. Clearly, deleting the edges
of $C$ from $H$ leaves a connected graph because connectivity is assured by
the tree $T$.

But, secondly, it is impossible that all connected components of $H'$
contain no other edges except those in $T'$: in that case, $H'$ would be a forest,
and hence have at most $k - 1$ edges. But the number of edges of $H'$ is greater
than $k - 1$, a contradiction. □
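
The proof of Lemma 2 is constructive, and the construction can be sketched in code: set aside a spanning tree $T$, then grow a forest $T'$ in the remaining edges until some edge closes a cycle. The function names and the small multigraph below are assumptions for illustration; edges are identified by their index in the input list so that parallel edges are handled.

```python
from collections import deque

def make_find(nodes):
    """Minimal union-find; returns the parent map and a find function."""
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    return parent, find

def forest_path(forest, s, t):
    """Edge indices on the path from s to t in a forest (breadth-first search)."""
    prev = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w, idx in forest[u]:
            if w not in prev:
                prev[w] = (u, idx)
                queue.append(w)
    path = []
    while prev[t] is not None:
        t, idx = prev[t]
        path.append(idx)
    return path

def removable_cycle(nodes, edges):
    """Indices of the edges of a cycle disjoint from a spanning tree, or None."""
    parent, find = make_find(nodes)
    rest = []  # edges of H' = H minus the spanning tree T
    for idx, (u, v) in enumerate(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv  # edge joins two components: put it in T
        else:
            rest.append(idx)
    parent, find = make_find(nodes)
    forest = {v: [] for v in nodes}  # spanning forest T' of H', grown edge by edge
    for idx in rest:
        u, v = edges[idx]
        if find(u) == find(v):
            return [idx] + forest_path(forest, u, v)  # edge closes a cycle in H'
        parent[find(u)] = find(v)
        forest[u].append((v, idx))
        forest[v].append((u, idx))
    return None

# Multigraph on 3 nodes with 5 > 2(3 - 1) edges, including parallel edges.
h_edges = [(1, 2), (2, 3), (1, 3), (1, 2), (2, 3)]
cycle = removable_cycle([1, 2, 3], h_edges)
print(sorted(cycle))
```

In this example the spanning tree consists of the first two edges, and the cycle found uses the remaining three, so deleting it leaves the tree and hence a connected graph.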

We can now complete the proof of the theorem. 

Proof of Theorem 3. Let $x$ be an optimal solution to the STSP, which
has, among all optimal solutions, the smallest number of edge traversals.

Construct a graph $H$ by starting with the node set $V$, and precisely $x_e$
copies of the edge $e$, for all $e \in E$. Then delete every isolated node from $H$.
The number of nodes $k$ of $H$ is at most $|V|$, and the number of edges is
$\ell = \sum_{e \in E} x_e$.

For the sake of contradiction, we assume that $\ell > 2(|V| - 1)$. If that is
the case, then Lemma 2 is applicable. Let $C$ be a cycle with the property
given in the lemma, and let $F$ be its edge set. For every $e \in E$, denote by $y_e$
the number of times the edge $e$ occurs in $C$. After deleting the edges of $C$
from $H$, a connected graph remains, and the parity of every node degree is
unchanged. This implies that $x - y$ is a solution to the STSP, whose total
cost is at most that of $x$. Thus, $x - y$ is
an optimal solution in which the total number of edge traversals is smaller
than in $x$, contradicting the choice of $x$.

Thus, we conclude that $\sum_{e \in E} x_e = \ell \le 2(|V| - 1)$. □

An immediate consequence of this theorem is that one does not need to
define the variables $r^k_a$ in the TS formulation when $k > 2(|V| - 1)$. The
constraints in which $k > 2(|V| - 1)$ can be dropped as well. As a result, the
number of variables and constraints in the TS formulation can be reduced
to $O(n|E|)$ and $O(n^2)$, respectively. We conjecture that this reduction in
size has no effect on the associated lower bound.

4.3 Summary 

Table 1 displays, for each of the STSP formulations that we have considered,
bounds on the total number of variables and constraints. Here, 'classical'
refers to the formulation of Fleischmann [13] mentioned in Subsection 2.3,

Formulation   Classical   SCF       MCF           TS1         TS2
Variables     |E|         O(|E|)    O(n_R |E|)    O(|E|^2)    O(n|E|)
Constraints   O(2^n)      O(|E|)    O(n_R |E|)    O(n|E|)     O(n^2)

Table 1: Alternative STSP formulations and their size

'SCF' refers to either of the single-commodity flow formulations given in
Subsections 3.2 and 3.3, 'MCF' refers to the multi-commodity flow
formulation given in Subsection 3.4, 'TS1' refers to the time-staged formulation
given in Subsection 4.1, and 'TS2' refers to the reduced time-staged
formulation given in Subsection 4.2.

Observe that, in the case of real road networks, the graph $G$ is typically
very sparse, and we have $|E| = O(|V|)$. Then, any of the new formulations
could potentially be used in practice. We would recommend using MCF or 
TS2 for small or medium-sized instances, due to the relative tightness of the 
bound, and SCF for large instances, due to the extremely small number of 
variables and constraints. 

Observe that we have not adapted the MTZ formulation to the Steiner
case. This is because the MTZ formulation is based on the idea of
determining the order in which the nodes are visited. Since nodes can be visited
multiple times in the Steiner case, a unique order cannot be determined. As
a result, it does not appear possible to adapt the MTZ formulation. This is
not a problem, though, given the extreme weakness of the MTZ formulation
mentioned in Subsection 2.2.



5 Some Related Problems 

Many variants and extensions of the TSP have appeared in the literature, 
such as the Orienteering Problem (e.g., [HI [121 US])) the Prize-Collecting 
TSP (e.g., [audi]), the Capacitated Profitable Tour Problem (e.g., [Tn[22]). 
the Generalized TSP (e.g., [IH [35]), the TSP with Time Windows (e.g., 
[2, 9]) and the Sequential Ordering Problem [10]. For each of these problems,
it is easy to define a 'Steiner' version. It suffices to define the problem on
a general graph $G = (V, E)$, designate node 1 as the 'depot', define a set
$V_R \subseteq V \setminus \{1\}$ of 'customer' nodes, permit edges to be traversed more than once
if desired, and permit nodes to be visited more than once if desired.

In this section, we explore possible ways to formulate these other
problems of 'Steiner' type. For the sake of brevity, however, we restrict attention
to three specific problems, which we call the Steiner Orienteering Problem,
the Steiner Capacitated Profitable Tour Problem, and the Steiner TSP with
Time Windows. These are considered in the following three subsections.

5.1 The Steiner Orienteering Problem

We define the Steiner Orienteering Problem (SOP) as follows. For each
$e \in E$, we are given a non-negative cost $c_e$. For each $i \in V_R$, we are given
a positive revenue (or 'prize') $p_i$. The nodes in $V_R$ do not all have to be
visited, but the revenue can only be collected from such a node if that node
is visited at least once. We are also given an upper bound $U$ on the total
route cost. The task is to maximise the sum of the prizes collected, subject
to the upper bound.

Observe that Lemma 1 applies to the SOP. To see this, let $V^* \subseteq V_R$
be the set of nodes whose prizes are collected in the optimal solution. The
optimal solution is then also optimal for a STSP instance defined on the
same graph, but with $V_R$ set to $V^*$.

Knowing that Lemma 1 applies, it is easy to adapt the classical
(non-compact) formulation of the STSP, presented in Subsection 2.3, to the SOP.
For each $i \in V_R$, we define a new binary variable $y_i$, taking the value 1 if
and only if the salesman collects a prize from node $i$. We then change the
objective function from (19) to:

\[
\max \sum_{i \in V_R} p_i y_i, \tag{50}
\]

replace the connectivity constraints (20) with:

\[
\sum_{e \in \delta(S)} x_e \ge 2 y_i \qquad (i \in V_R,\ S \subset V \setminus \{1\} : i \in S), \tag{51}
\]

and add the route-cost constraint

\[
\sum_{e \in E} c_e x_e \le U. \tag{52}
\]

It is also easy to adapt the TS formulation of the STSP (Subsection 4.1)
to the SOP. It suffices to add the $y_i$ variables mentioned above, change the
objective function from (43) to (50), add the route-cost constraint

\[
\sum_{k=1}^{|A|} \sum_{a \in A} c_a r^k_a \le U,
\]

and replace the constraints (47) with the constraints

\[
\sum_{k=1}^{|A|} \sum_{a \in \delta^+(i)} r^k_a \ge y_i \qquad (\forall i \in V_R). \tag{53}
\]

Moreover, Theorem 3, given in Subsection 4.2, applies to the SOP as well
(for the same reason that Lemma 1 applies). So one can reduce the number
of stages to $2(|V| - 1)$, without losing any optimal solutions.

It is also easy to adapt the MCF formulation of the STSP (Subsection
3.4) to the SOP. It suffices to add the same $y_i$ variables, change the objective
function from (23) to (50), change the right-hand sides of constraints (35)
and (38) from 1 to $y_i$ and $y_k$, respectively, change the right-hand sides of
constraints (39) from $-1$ to $-y_k$, and add the route-cost constraint:

\[
\sum_{a \in A} c_a x_a \le U.
\]

As for the SCF formulation of the STSP (Subsection 3.2), there is an
elegant way to adapt it to the SOP, which leads to an LP relaxation with
desirable properties. The key is to redefine the continuous variables $g_a$, so
that:

• if arc $a$ is traversed (i.e., $x_a = 1$), then $g_a$ represents the total cost
accumulated so far when the salesman begins to traverse the arc;

• if arc $a$ is not traversed (i.e., $x_a = 0$), then $g_a = 0$.

Once this is done, one can introduce the same additional $y_i$ variables, and
use the objective function (50), along with the following constraints:

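\begin{align}
& \sum_{a \in \delta^+(i)} x_a \ge y_i & (\forall i \in V_R) \tag{56}\\
& \sum_{a \in \delta^+(i)} x_a = \sum_{a \in \delta^-(i)} x_a & (\forall i \in V) \tag{57}\\
& \sum_{a \in \delta^+(i)} g_a - \sum_{a \in \delta^-(i)} g_a = \sum_{a \in \delta^-(i)} c_a x_a & (\forall i \in V \setminus \{1\}) \tag{58}\\
& 0 \le g_a \le (U - c_a) x_a & (\forall a \in A) \tag{59}\\
& x_a \in \{0, 1\} & (\forall a \in A) \tag{60}\\
& y_i \in \{0, 1\} & (\forall i \in V_R). \tag{61}
\end{align}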

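We then have the following analogue of Theorem 1:

Proposition 2 Let $(x^*, g^*, y^*) \in [0, 1]^{|A|} \times \mathbb{R}_+^{|A|} \times [0, 1]^{n_R}$ satisfy the
constraints (56)-(59). Let $\bar{x}^* \in [0, 2]^{|E|}$ be defined by setting $\bar{x}^*_{ij} = x^*_{ij} + x^*_{ji}$ for
all $\{i, j\} \in E$. Then $(\bar{x}^*, y^*)$ satisfies all of the following linear inequalities:

\[
U \sum_{e \in \delta(S)} \bar{x}_e \ge \sum_{\{i,j\} \in E :\, \{i,j\} \cap S \ne \emptyset} c_{ij} \bar{x}_{ij} \qquad (\forall S \subset V \setminus \{1\} : S \cap V_R \ne \emptyset).
\]

Proof. Similar to the proof of Theorem 1. □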

As in the case of the STSP (Subsection 3.3), it is possible to strengthen
this SCF formulation of the SOP. Indeed, if a given arc $(i,j)$ is traversed,
then the smallest value that $g_{ij}$ can take is equal to the cost of the shortest
path from the depot to node $i$. Similarly, the largest value that $g_{ij}$ can take
is equal to $U - c_{ij}$ minus the cost of the shortest path from node $j$ to the
depot. One can adjust the constraints (59) accordingly, and then derive a
stronger projection result, analogous to Theorem 2. We omit details, for the
sake of brevity.
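
The two bounds just described can be computed with a single shortest-path run from the depot. The following Python sketch does this with Dijkstra's algorithm; the function names and the toy instance are hypothetical, the graph is assumed to be connected and undirected, and how the bounds are then written into the constraints is left open, as in the text.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path costs from `source` in a graph given as {node: [(neighbour, cost), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def tightened_bounds(edges, U, depot=1):
    """edges: {(i, j): cost} for undirected edges. Returns {arc: (lower, upper)} for g on each arc."""
    adj = {}
    for (i, j), c in edges.items():
        adj.setdefault(i, []).append((j, c))
        adj.setdefault(j, []).append((i, c))
    # The graph is undirected, so d[i] is also the cost of returning from i to the depot.
    d = dijkstra(adj, depot)
    bounds = {}
    for (i, j), c in edges.items():
        for tail, head in ((i, j), (j, i)):
            bounds[(tail, head)] = (d[tail], U - c - d[head])
    return bounds

# Hypothetical instance: a triangle with one expensive edge, and U = 12.
edges = {(1, 2): 2, (2, 3): 3, (1, 3): 10}
bounds = tightened_bounds(edges, U=12)
```

An arc whose lower bound exceeds its upper bound, such as $(1,3)$ in this toy instance, cannot be traversed in any route of cost at most $U$.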

5.2 The Steiner Capacitated Profitable Tour Problem

The Steiner Capacitated Profitable Tour Problem (SCPTP) is similar to the 
SOP, but with the following differences: 

• We are given a positive demand $q_i$ for each $i \in V_R$, in addition to the
revenue $p_i$.

• If we wish to gain the revenue for a given $i \in V_R$, then we have to
deliver the demand $q_i$.

• Instead of an upper bound U on the route cost, we are given a vehicle 
capacity Q, which does not exceed the sum of the demands. The total 
demand of the serviced customers must not exceed Q. 

• The task is to find a tour of maximum total profit, where the profit is 
defined as the sum of the revenues gained, minus the cost of the edges 
traversed. 

Observe that Lemma 1 applies to the SCPTP, for the same reason that
it applies to the SOP. Then, one can easily adapt the classical formulation
of the STSP to the SCPTP. We use the same binary variables $y_i$ as used in
the previous subsection, change the objective function from (19) to

\[ \max \sum_{i \in V_R} p_i y_i - \sum_{e \in E} c_e x_e, \]



replace the connectivity constraints with the constraints (51), and add
the capacity constraint

\[ \sum_{i \in V_R} q_i y_i \le Q. \tag{62} \]

One can adapt the TS formulation in a similar way. It suffices to add
the same $y_i$ variables, add the capacity constraint (62), change the objective
function to

\[ \max \sum_{i \in V_R} p_i y_i - \sum_{k=1}^{2|E|} \sum_{a \in A} c_a \tilde{x}_a^k, \]

and replace the constraints (47) with the constraints (53). Moreover,
Theorem 3 is again applicable, and one can reduce the number of stages to
$2(|V| - 1)$.

As for the SCF formulation, we propose again to redefine the continuous
variables $g_a$. Now, $g_a$ represents the total load (if any) that is carried along
the arc $a$. Then, again using the additional $y_i$ variables, it suffices to:

\[ \max \sum_{i \in V_R} p_i y_i - \sum_{a \in A} c_a \tilde{x}_a \tag{63} \]

subject to the following constraints:

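\begin{align}
\sum_{a \in \delta^+(1)} \tilde{x}_a &\ge 1 \tag{64}\\
\sum_{a \in \delta^+(i)} \tilde{x}_a &\ge y_i && (\forall i \in V_R) \tag{65}\\
\sum_{a \in \delta^+(i)} \tilde{x}_a &= \sum_{a \in \delta^-(i)} \tilde{x}_a && (\forall i \in V) \tag{66}\\
\sum_{a \in \delta^+(1)} g_a - \sum_{a \in \delta^-(1)} g_a &\le Q \tag{67}\\
\sum_{a \in \delta^-(i)} g_a - \sum_{a \in \delta^+(i)} g_a &= q_i y_i && (\forall i \in V_R) \tag{68}\\
\sum_{a \in \delta^-(i)} g_a - \sum_{a \in \delta^+(i)} g_a &= 0 && (\forall i \in V \setminus (V_R \cup \{1\})) \tag{69}\\
0 \le g_a &\le Q \tilde{x}_a && (\forall a \in A) \tag{70}\\
\tilde{x}_a &\in \{0,1\} && (\forall a \in A) \tag{71}\\
y_i &\in \{0,1\} && (\forall i \in V_R). \tag{72}
\end{align}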

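The analogue of Theorem 1 is now as follows:

Proposition 3 Let $(\tilde{x}^*, g^*, y^*) \in [0,1]^{|A|} \times \mathbb{R}^{|A|} \times [0,1]^{n_R}$ satisfy (64)-(70).
Let $x^* \in [0,2]^{|E|}$ be defined by setting $x^*_{ij} = \tilde{x}^*_{ij} + \tilde{x}^*_{ji}$ for all $\{i,j\} \in E$. Then
$(x^*, y^*)$ satisfies all of the following linear inequalities:

\[ \sum_{e \in \delta(S)} x_e \ge \frac{2}{Q} \sum_{i \in S \cap V_R} q_i y_i \qquad (\forall S \subset V \setminus \{1\} : S \cap V_R \ne \emptyset). \tag{73} \]

Proof. Similar to the proof of Theorem 1. □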



Moreover, if one sums together the constraints (67)-(69), one obtains the
capacity constraint (62). So the capacity constraint does not need to be
added to this SCF formulation.
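
The summation can be written out as follows. This is a sketch which assumes, as the index sets of (68) and (69) suggest, that the depot is not among the nodes covered by (68).

```latex
% Add (68) over i \in V_R and (69) over the remaining non-depot nodes:
\begin{align*}
\sum_{i \in V \setminus \{1\}} \Bigl( \sum_{a \in \delta^-(i)} g_a - \sum_{a \in \delta^+(i)} g_a \Bigr)
  &= \sum_{i \in V_R} q_i y_i. \\
% Every arc enters exactly one node and leaves exactly one node, so the same
% difference summed over all of V is zero, and the left-hand side above equals
% the net load leaving the depot:
\sum_{a \in \delta^+(1)} g_a - \sum_{a \in \delta^-(1)} g_a
  &= \sum_{i \in V_R} q_i y_i \;\le\; Q \quad \text{by (67)}.
\end{align*}
```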

As for the MCF formulation described in Subsection 3.4, we propose to
redefine the binary variables $f_a^k$ to be 1 if and only if $q_k$ units of commodity
$k$ pass through arc $a$. Then, using the same additional $y_i$ variables, it
suffices to change the objective function to (63), change the right-hand sides
of constraints (35) and (38) from $1$ to $y_j$, change the right-hand sides of
constraints (39) from $-1$ to $-y_i$, and add the constraints:

\[ \sum_{k \in V_R} q_k f_a^k \le Q \tilde{x}_a \qquad (\forall a \in A). \]

It can be shown that the projection of this formulation into $(x, y)$ space
satisfies the inequalities (51) and (75), along with the capacity constraint
(62). We omit the details for brevity.



5.3 The Steiner TSP with Time Windows 

Finally, we define the Steiner Traveling Salesman Problem with Time
Windows (STSPTW) as follows. As before, we are given a non-negative cost
$c_e$ for each $e \in E$. For each $e \in E$, we are also given a non-negative traversal time
$t_e$. Moreover, for each $i \in V_R$, we are given a non-negative servicing time
$s_i$, along with a time window $[a_i, b_i]$. Finally, we are given a positive time
$T$ by which the vehicle must return to the depot. All nodes in $V_R$ must be
visited at least once. On one such visit, the customer must receive service.
The time at which service begins must lie between $a_i$ and $b_i$. The task is
to minimise the cost of the tour. We assume without loss of generality that 
the vehicle departs from the depot at time zero. We also assume that the 
vehicle is permitted to wait at any customer node, if it arrives at that node 
before service is due to begin. 

Perhaps surprisingly, the situation here is completely different from those 
of the previous two subsections. To be specific: 

• Lemma 1 does not apply. To see this, set $V = \{1, \ldots, 4\}$, $V_R = \{2, 3, 4\}$
and $E = \{\{1,2\}, \{1,3\}, \{2,4\}\}$, set $c_e = t_e = 1$ for all $e \in E$, set $s_i = 0$
for $i \in \{2,3,4\}$, and set $a_2 = b_2 = 1$, $a_3 = b_3 = 3$, and $a_4 = b_4 = 6$.
The unique optimal solution is for the salesman to service nodes 2, 3
and 4 in that order, and then return to the depot. In this solution,
the edge $\{1,2\}$ is traversed 4 times.

• Theorem 3 does not apply either. In the same example, the total
number of edge traversals is 8, whereas $2(|V| - 1)$ is only 6.

• In fact it is not even true that the total number of edge traversals is
bounded by $2|E|$, as the same example shows.

• The only thing that one can say in general seems to be that the total
number of edge traversals is bounded by $(n_R + 1)(|V| - 1)$. (This is so
since the maximum number of edge traversals between two successive
occasions of service, or between a service and the vehicle leaving or
returning to the depot, will never exceed $|V| - 1$ in an optimal solution.)
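
The counts quoted in this example can be checked mechanically. The Python sketch below follows the walk 1, 2, 1, 3, 1, 2, 4, 2, 1, services the customers in the order 2, 3, 4, and counts edge traversals. The function name is hypothetical, and the servicing times are taken to be zero, which is the reading under which the stated time windows can all be met.

```python
from collections import Counter

def simulate(walk, service_order, travel_time, service_time, window):
    """Follow `walk`, servicing customers in `service_order` on first arrival, waiting if early.
    Returns (feasible, edge_traversal_counts)."""
    clock = 0
    pending = list(service_order)
    counts = Counter()
    for i, j in zip(walk, walk[1:]):
        edge = frozenset((i, j))
        clock += travel_time[edge]
        counts[edge] += 1
        if pending and j == pending[0]:
            earliest, latest = window[j]
            start = max(clock, earliest)   # waiting at a customer node is permitted
            if start > latest:
                return False, counts       # time window missed
            clock = start + service_time[j]
            pending.pop(0)
    return not pending, counts

# Data of the example (servicing times of 0 are an assumption of this sketch).
travel_time = {frozenset(e): 1 for e in [(1, 2), (1, 3), (2, 4)]}
service_time = {2: 0, 3: 0, 4: 0}
window = {2: (1, 1), 3: (3, 3), 4: (6, 6)}
walk = [1, 2, 1, 3, 1, 2, 4, 2, 1]

feasible, counts = simulate(walk, [2, 3, 4], travel_time, service_time, window)
num_nodes, num_edges, n_required = 4, 3, 3
print(feasible, counts[frozenset((1, 2))], sum(counts.values()))                     # True 4 8
print(2 * (num_nodes - 1), 2 * num_edges, (n_required + 1) * (num_nodes - 1))        # 6 6 12
```

The walk uses 8 edge traversals, which exceeds both $2(|V| - 1) = 6$ and $2|E| = 6$ but stays within $(n_R + 1)(|V| - 1) = 12$.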

For these reasons, it does not seem possible to adapt the classical, SCF or
MCF formulations to the STSPTW, and it does not seem desirable to adapt
the TS formulation, since one would need $(n_R + 1)(|V| - 1)$ time stages.

On a more positive note, however, there exists a compact formulation
of the STSPTW that uses only $O(n_R|E|)$ variables and constraints. The
necessary variable definitions are as follows. For every $a \in A$ and
$k = 0, \ldots, n_R$, let the binary variable $\tilde{x}_a^k$ take the value 1 if and only if the
salesman traverses arc $a$ after having serviced exactly $k$ customers so far.
Also let $g_a^k$ be a non-negative continuous variable representing the total time
that has elapsed when the salesman starts to traverse arc $a$, having
serviced exactly $k$ customers, or zero if no such traversal occurs. Finally, for all $i \in V_R$
and $k = 1, \ldots, n_R$, let the binary variable $y_i^k$ take the value 1 if and only if
customer $i$ is the $k$th customer to be serviced.



The objective function is simply:

\[ \min \sum_{k=0}^{n_R} \sum_{a \in A} c_a \tilde{x}_a^k. \]

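To ensure that each required node is serviced exactly once, we have the
following constraints:

\begin{align*}
\sum_{k=1}^{n_R} y_i^k &= 1 && (\forall i \in V_R)\\
\sum_{i \in V_R} y_i^k &= 1 && (k = 1, \ldots, n_R).
\end{align*}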



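To ensure that the vehicle departs from and returns to the depot a correct
number of times, we have:

\begin{align*}
\sum_{a \in \delta^+(1)} \tilde{x}_a^0 &= 1\\
\sum_{a \in \delta^-(1)} \tilde{x}_a^k &= \sum_{a \in \delta^+(1)} \tilde{x}_a^k && (k = 1, \ldots, n_R - 1).
\end{align*}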

Then, to ensure that the vehicle departs from each non-depot node as many 
times as it arrives, we have: 

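\begin{align*}
\sum_{a \in \delta^-(i)} \tilde{x}_a^0 &= y_i^1 + \sum_{a \in \delta^+(i)} \tilde{x}_a^0 && (\forall i \in V_R)\\
y_i^k + \sum_{a \in \delta^-(i)} \tilde{x}_a^k &= y_i^{k+1} + \sum_{a \in \delta^+(i)} \tilde{x}_a^k && (\forall i \in V_R,\ k = 1, \ldots, n_R - 1)\\
y_i^{n_R} + \sum_{a \in \delta^-(i)} \tilde{x}_a^{n_R} &= \sum_{a \in \delta^+(i)} \tilde{x}_a^{n_R} && (\forall i \in V_R)\\
\sum_{a \in \delta^-(i)} \tilde{x}_a^k &= \sum_{a \in \delta^+(i)} \tilde{x}_a^k && (\forall i \in V \setminus (V_R \cup \{1\}),\ k = 0, \ldots, n_R).
\end{align*}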

Next, to ensure that the $g_a^k$ variables take the value that they should, we
add the following constraint for $i \in V_R$ and for $k = 0, \ldots, n_R - 1$:

and the following constraint for $i \in V \setminus V_R$ and for $k = 0, \ldots, n_R$:

\[ \sum_{a \in \delta^+(i)} g_a^k \ge \sum_{a \in \delta^-(i)} g_a^k + \sum_{a \in \delta^-(i)} t_a \tilde{x}_a^k. \]



Moreover, to ensure that the time windows are obeyed, we add the following 
constraints: 



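\begin{align*}
\sum_{a \in \delta^+(i)} g_a^k &\ge (a_i + s_i) y_i^k && (\forall i \in V_R,\ k = 1, \ldots, n_R)\\
\sum_{a \in \delta^-(i)} g_a^{k-1} + \sum_{a \in \delta^-(i)} t_a \tilde{x}_a^{k-1} &\le T - (T - b_i) y_i^k && (\forall i \in V_R,\ k = 1, \ldots, n_R).
\end{align*}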

Finally, we have the trivial constraints:

\begin{align*}
\tilde{x}_a^k &\in \{0,1\} && (\forall a \in A,\ k = 0, \ldots, n_R)\\
y_i^k &\in \{0,1\} && (\forall i \in V_R,\ k = 1, \ldots, n_R)\\
0 \le g_a^k &\le T \tilde{x}_a^k && (\forall a \in A,\ k = 0, \ldots, n_R).
\end{align*}

As stated above, this formulation has $O(n_R|E|)$ variables and constraints.
We leave the existence of a significantly smaller compact formulation as an 
open question. 

We remark that it is not easy to convert the STSPTW into the standard 
TSPTW by computing all-pairs shortest paths. This is because a cheapest 
path between two nodes is not always the same as the quickest path. We 
will address this issue in detail in another paper [25]. 
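
A three-node instance is enough to see the difficulty. In the Python sketch below, each edge carries a (cost, time) pair; the instance, the function name and the `key` parameter are all hypothetical. Running Dijkstra's algorithm once on costs and once on times returns two different paths between the same pair of nodes.

```python
import heapq

def best_path(adj, source, target, key):
    """Dijkstra on one weight component (key 0 = cost, key 1 = time). Returns (value, path)."""
    heap = [(0, source, [source])]
    settled = set()
    while heap:
        value, node, path = heapq.heappop(heap)
        if node == target:
            return value, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, weights in adj[node]:
            if nxt not in settled:
                heapq.heappush(heap, (value + weights[key], nxt, path + [nxt]))
    return None

# Hypothetical instance: the direct edge {1,2} is cheap but slow, the detour via 3 is quick but dear.
adj = {
    1: [(2, (1, 10)), (3, (2, 1))],
    2: [(1, (1, 10)), (3, (2, 1))],
    3: [(1, (2, 1)), (2, (2, 1))],
}
cheapest = best_path(adj, 1, 2, key=0)   # (1, [1, 2])
quickest = best_path(adj, 1, 2, key=1)   # (2, [1, 3, 2])
```

If a time window at node 2 closes before time 10, only the dearer path is usable, so a single precomputed shortest-path matrix cannot represent both options.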



6 Concluding Remarks 

Our motive for looking at the 'Steiner' version of the TSP and its vari- 
ants was that many real-life vehicle routing problems are defined on road 
networks, rather than complete graphs as normally assumed in the litera- 
ture. Moreover, 'compact' formulations are of interest, not only for their 
elegance, but also because one can just feed them into a standard branch- 
and-bound solver, without having to implement complex solution methods 
such as branch-and-cut. 

We have seen that the classical, single-commodity flow, multi-commodity 
flow and time-staged formulations of the Traveling Salesman Problem can 
all be adapted to the Steiner Traveling Salesman Problem, the Steiner Ori- 
enteering Problem and the Steiner Capacitated Profitable Tour Problem. In 
some cases, we can characterise the projections of the resulting LP relax- 
ations into the space of the 'natural' variables. Moreover, in some cases, the 
formulations can be easily strengthened, without increasing their size. 

On the other hand, it does not seem possible to adapt the above for- 
mulations to the Steiner Traveling Salesman Problem with Time Windows. 
Nevertheless, we have produced a compact formulation of this problem which 
is of reasonable size. 

We believe that all of the formulations presented in this paper are po- 
tentially of practical use. Possible topics for future research would be the 
derivation of smaller and/or stronger compact formulations for the problems 
mentioned, the derivation of useful compact formulations for the Steiner ver- 
sion of other variants of the TSP, and exploring the potential of extending 
the approach to problems with multiple vehicles and/or depots.